Application of Minimal Perfect Hashing in Main Memory Indexing

Authors

  • Yuk Ho
  • Jerome H. Saltzer
Abstract

With the rapid decrease in the cost of random access memory (RAM), it will soon become economically feasible to place the full-text indexes of a library in main memory. One essential component of the indexing system is a hashing algorithm, which maps a keyword to the memory address of the index information corresponding to that keyword. This thesis studies the application of the minimal perfect hashing algorithm to main memory indexing. The algorithm is integrated into the index search engine of the Library 2000 system, a digital on-line library system, and its performance is compared with that of the open-addressing hashing scheme. We find that although the minimal perfect hashing algorithm needs fewer keyword comparisons per keyword search on average, its hashing performance is slower than that of the open-addressing scheme.

Thesis Supervisor: Jerome H. Saltzer
Title: Professor, Department of Electrical Engineering and Computer Science
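The comparison-count contrast described in the abstract can be illustrated with a small sketch. It is not the thesis's algorithm or the Library 2000 code: the function names (`salted_hash`, `build_open_addressing`, `build_mph`, and so on) are hypothetical, linear probing is assumed as the open-addressing variant, and the minimal perfect hash is found by brute-force seed search, which is only practical for a handful of keys.

```python
import hashlib


def salted_hash(key: str, seed: int, modulus: int) -> int:
    """Deterministic hash of `key` into range(modulus), varied by `seed`."""
    digest = hashlib.blake2b(
        key.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little") % modulus


def build_open_addressing(keys, size):
    """Insert keys into a linear-probing table; `size` must exceed len(keys)."""
    table = [None] * size
    for key in keys:
        slot = salted_hash(key, 0, size)
        while table[slot] is not None:  # collision: step to the next slot
            slot = (slot + 1) % size
        table[slot] = key
    return table


def probe_open_addressing(table, key):
    """Return (found, keyword comparisons performed) for one search."""
    size = len(table)
    slot = salted_hash(key, 0, size)
    comparisons = 0
    for _ in range(size):
        if table[slot] is None:  # empty slot ends the probe sequence
            return False, comparisons
        comparisons += 1
        if table[slot] == key:
            return True, comparisons
        slot = (slot + 1) % size
    return False, comparisons


def build_mph(keys, max_seed=1_000_000):
    """Brute-force a seed that maps the n keys bijectively onto 0..n-1."""
    n = len(keys)
    for seed in range(max_seed):
        if len({salted_hash(k, seed, n) for k in keys}) == n:
            table = [None] * n
            for k in keys:
                table[salted_hash(k, seed, n)] = k
            return seed, table
    raise RuntimeError("no seed found; a real MPH construction is needed")


def probe_mph(seed, table, key):
    """One hash evaluation and exactly one keyword comparison."""
    slot = salted_hash(key, seed, len(table))
    return (table[slot] == key, 1)
```

Under these assumptions, a successful search in the minimal perfect table always costs one keyword comparison, while the linear-probing table costs at least one and more whenever a collision displaced the key. The sketch says nothing about the cost of evaluating the hash function itself, which is where the abstract reports the minimal perfect scheme losing time.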

Similar resources

Indexing Internal Memory with Minimal Perfect Hash Functions

A perfect hash function (PHF) is an injective function that maps keys from a set S to unique values, which are in turn used to index a hash table. Since no collisions occur, each key can be retrieved from the table with a single probe. A minimal perfect hash function (MPHF) is a PHF with the smallest possible range, that is, the hash table size is exactly the number of keys in S. MPHFs are wide...

Monotone minimal perfect hashing: searching a sorted table with O(1) accesses

A minimal perfect hash function maps a set S of n keys into the set {0, 1, ..., n − 1} bijectively. Classical results state that minimal perfect hashing is possible in constant time using a structure occupying space close to the lower bound of log e bits per element. Here we consider the problem of monotone minimal perfect hashing, in which the bijection is required to preserve the lexicogr...

A Practical Minimal Perfect Hashing Method

We propose a novel algorithm based on random graphs to construct minimal perfect hash functions h. For a set of n keys, our algorithm outputs h in expected time O(n). The evaluation of h(x) requires two memory accesses for any key x and the description of h takes up 1.15n words. This improves the space requirement to 55% of a previous minimal perfect hashing scheme due to Czech, Havas and Majew...

Hash and Displace: Efficient Evaluation of Minimal Perfect Hash Functions

A new way of constructing (minimal) perfect hash functions is described. The technique considerably reduces the overhead associated with resolving buckets in two-level hashing schemes. Evaluating a hash function requires just one multiplication and a few additions apart from primitive bit operations. The number of accesses to memory is two, one of which is to a fixed location. This improves the...
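The two-level idea in this abstract can be sketched as follows. This is a simplified illustration, not the paper's construction: it assumes a hash of the form (f(x) + d[g(x)]) mod n, uses as many buckets as keys, picks displacements greedily, and retries with fresh salts on failure. The names `salted_hash`, `build_displacements`, and `evaluate` are hypothetical, and BLAKE2 stands in for the cheap multiply-and-add hash functions the abstract mentions.

```python
import hashlib


def salted_hash(key: str, salt: int, modulus: int) -> int:
    """Deterministic hash of `key` into range(modulus), varied by `salt`."""
    digest = hashlib.blake2b(
        key.encode("utf-8"), digest_size=8, salt=salt.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little") % modulus


def build_displacements(keys, max_tries=1000):
    """Find salts and a displacement table giving a bijection onto 0..n-1."""
    n = len(keys)
    for attempt in range(max_tries):
        salt_f, salt_g = 2 * attempt, 2 * attempt + 1
        # First level: g assigns every key to one of n buckets.
        buckets = [[] for _ in range(n)]
        for key in keys:
            buckets[salted_hash(key, salt_g, n)].append(key)
        displacement = [0] * n
        occupied = [False] * n
        placed_all = True
        # Place the largest buckets first, while the table is still sparse.
        for b in sorted(range(n), key=lambda i: len(buckets[i]), reverse=True):
            for d in range(n):
                slots = [(salted_hash(k, salt_f, n) + d) % n for k in buckets[b]]
                if len(set(slots)) == len(slots) and not any(
                    occupied[s] for s in slots
                ):
                    for s in slots:
                        occupied[s] = True
                    displacement[b] = d
                    break
            else:
                # No shift fits this bucket; abandon this pair of salts.
                placed_all = False
                break
        if placed_all:
            return salt_f, salt_g, displacement
    raise RuntimeError("no displacement table found")


def evaluate(key, salt_f, salt_g, displacement):
    """Second level: shift f(key) by the displacement stored for its bucket."""
    n = len(displacement)
    bucket = salted_hash(key, salt_g, n)
    return (salted_hash(key, salt_f, n) + displacement[bucket]) % n
```

In this sketch a lookup reads one entry of the displacement table and then the final table slot, so the per-key state is a single small integer per bucket.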

Fast and Scalable Minimal Perfect Hashing for Massive Key Sets

Minimal perfect hash functions provide space-efficient and collision-free hashing on static sets. Existing algorithms and implementations that build such functions have practical limitations on the number of input elements they can process, due to high construction time, RAM or external memory usage. We revisit a simple algorithm and show that it is highly competitive with the state of the art,...

Publication date: 1994